print feature flags used for matching pkgimage #50172

Merged
vchuravy merged 12 commits into master from vc/improve_debug_pkgimage
Aug 7, 2023

Conversation

vchuravy
Member

To help with debugging of #50102, #50148, and #49705

```
julia> @ccall jl_dump_host_cpu()::Cvoid
CPU: znver2
Features: sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, rdrnd, fsgsbase, bmi, avx2, bmi2, rdseed, adx, clflushopt, clwb, sha, rdpid, sahf, lzcnt, sse4a, prfchw, mwaitx, xsaveopt, xsavec, xsaves, clzero, wbnoinvd

julia> target = only(Base.current_image_targets())
znver2; flags=0; features_en=(sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, fsgsbase, bmi, avx2, bmi2, adx, clflushopt, clwb, sha, rdpid, sahf, lzcnt, sse4a, prfchw, mwaitx, xsavec, xsaves, clzero, wbnoinvd)
```
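
The two feature lists in the output above differ in a few entries, which is easy to miss by eye. A minimal sketch of how one might diff them outside of Julia, assuming the lists are copied as plain comma-separated strings; `parse_features` and `host_only_features` are hypothetical helper names and are not part of this PR:

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split a comma-separated feature list such as "sse3, pclmul, avx" into names.
// (Hypothetical helper for illustration only.)
std::set<std::string> parse_features(const std::string &list)
{
    std::set<std::string> out;
    std::istringstream in(list);
    std::string item;
    while (std::getline(in, item, ',')) {
        // Trim the spaces left over from the ", " separator.
        const auto first = item.find_first_not_of(' ');
        if (first == std::string::npos)
            continue; // skip empty entries
        const auto last = item.find_last_not_of(' ');
        out.insert(item.substr(first, last - first + 1));
    }
    return out;
}

// Features reported for the host CPU that are not enabled in the image target.
std::vector<std::string> host_only_features(const std::string &host,
                                            const std::string &target)
{
    const std::set<std::string> h = parse_features(host);
    const std::set<std::string> t = parse_features(target);
    std::vector<std::string> diff;
    for (const auto &name : h)
        if (t.count(name) == 0)
            diff.push_back(name);
    return diff; // alphabetical, because std::set iterates in sorted order
}
```

Applied to the `Features:` and `features_en=(...)` lists shown above, this yields `rdrnd`, `rdseed`, and `xsaveopt`: the host reports them, but the image target does not enable them.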

cc: @mkitti @nrontsis @simonbyrne

@mkitti
Contributor

mkitti commented Jun 15, 2023

@bjarthur we should try this and then set up JULIA_CPU_TARGET

@DilumAluthge
Member

@vchuravy Seems like this has already proven useful in debugging. Would it be worth merging this into master?

@vchuravy vchuravy marked this pull request as ready for review June 18, 2023 08:59
@vchuravy
Member Author

I am not so happy that this exposes quite a few details; I would rather have a more rigorous treatment of CPU features in CPUID, but I suppose it does help debugging.

@vtjnash do you have time to review?

@DilumAluthge DilumAluthge requested a review from vtjnash June 18, 2023 17:34
@DilumAluthge
Member

@vchuravy Looks like there are some merge conflicts as well.

@vchuravy vchuravy requested a review from pchintalapudi June 18, 2023 21:18
@pchintalapudi
Member

If the goal here is to debug why cache files are being rejected, I wonder if we can simply alter the error message from match_sysimg_targets to tell us more about why each target in the image didn't match. Assuming it's unlikely that more than 3 targets per image are specified, that shouldn't overwhelm the user with too much more data than a flat rejection.

for (uint32_t i = 0; i < sysimg.size(); i++) {
    auto &imgt = sysimg[i];
    if (!(imgt.en.features & target.dis.features).empty()) {
        // Check sysimg enabled features against runtime disabled features
        // This is valid (and all what we can do)
        // even if one or both of the targets are unknown.
        rejection_reasons.push_back("Rejecting this target due to use of runtime-disabled features\n");
Member Author

Neat! Could we also print which features were disabled?
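
One way to answer that request is to intersect the image's enabled features with the runtime's disabled features and print the names of the surviving bits. The sketch below is only an illustration under simplifying assumptions: it uses a single `uint32_t` mask instead of Julia's actual feature-list type, and `kFeatureNames` and `conflicting_features` are hypothetical names with a made-up bit layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical name table: bit i of a mask corresponds to kFeatureNames[i].
// The real feature tables in src/processor_*.cpp are laid out differently.
static const std::vector<std::string> kFeatureNames = {"sse3", "avx", "avx2", "rdrnd"};

// Return a comma-separated list of features that the image enabled
// but that are disabled at runtime.
std::string conflicting_features(uint32_t image_enabled, uint32_t runtime_disabled)
{
    const uint32_t conflict = image_enabled & runtime_disabled;
    std::string names;
    for (std::size_t bit = 0; bit < kFeatureNames.size(); bit++) {
        if (conflict & (uint32_t{1} << bit)) {
            if (!names.empty())
                names += ", ";
            names += kFeatureNames[bit];
        }
    }
    return names;
}
```

The returned string could then be appended to the rejection reason, so the user sees which features caused the mismatch rather than only that one occurred.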

@vchuravy vchuravy force-pushed the vc/improve_debug_pkgimage branch from 060bf7b to fba1d72 Compare June 28, 2023 09:42
@vchuravy vchuravy added the backport 1.10 (Change should be backported to the 1.10 release) label Jun 28, 2023
}
if (match.best_idx == (uint32_t)-1) {
    // Construct a nice error message for debugging purposes
    std::string error_msg = "Unable to find compatible target in cached code image.\n";
Member

Normally you would wrap this in a std::stringstream, so that + isn't O(n^2) and you get << syntax (or llvm::raw_string_ostream if you want it to be a generic IO object instead)
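
A minimal sketch of the stream-based approach suggested here, using `std::ostringstream` from the standard library. The function name `build_rejection_message` and the per-target line format are assumptions made for illustration; only the first message line is taken from the snippet under review.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Build the whole error message in one stream instead of chaining
// std::string operator+, which creates a temporary for each concatenation.
std::string build_rejection_message(const std::vector<std::string> &reasons)
{
    std::ostringstream msg;
    msg << "Unable to find compatible target in cached code image.\n";
    for (std::size_t i = 0; i < reasons.size(); i++) {
        // Hypothetical per-target format, not the wording used in the PR.
        msg << "Target " << i << ": " << reasons[i] << '\n';
    }
    return msg.str();
}
```

The `<<` operator also formats integers directly, so the target index needs no separate `std::to_string` call.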

@vtjnash
Member

vtjnash commented Jun 29, 2023

A couple of functions need JL_NOTSAFEPOINT annotations added:

./processor_x86.cpp:1093:17: error: Calling potential safepoint as SimpleFunctionCall from function annotated JL_NOTSAFEPOINT [julia.GCChecker]
                unset_bits(features_en, fename.bit);
                ^
/cache/build/default-amdci4-3/julialang/julia-master/src/processor.cpp:116:20: note: Tried to call method defined here
static inline void unset_bits(T &bits, T1 _bitidx, Rest... rest)
                   ^
/cache/build/default-amdci4-3/julialang/julia-master/src/processor.cpp:973:18: note: Calling 'jl_get_llvm_clone_targets'
    auto specs = jl_get_llvm_clone_targets();
                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~
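
The analyzer complains because a function annotated `JL_NOTSAFEPOINT` calls a helper that carries no such annotation. A rough sketch of the kind of fix being asked for, under stated assumptions: the macro is given an empty stand-in definition here (in Julia's own headers it is assumed to expand to an analyzer attribute only in GC-checker builds), and `unset_bit` and `disable_feature` are simplified, hypothetical stand-ins for the variadic `unset_bits` template and its caller in `src/processor.cpp`.

```cpp
#include <cstdint>

// Stand-in so the sketch compiles on its own. Assumption: in a normal
// (non-analyzer) build the real macro also expands to nothing.
#ifndef JL_NOTSAFEPOINT
#define JL_NOTSAFEPOINT
#endif

// Simplified helper: clear one bit in a 32-bit feature mask.
// Annotating it declares that it never reaches a GC safepoint.
static inline void unset_bit(uint32_t &bits, uint32_t bitidx) JL_NOTSAFEPOINT
{
    bits &= ~(uint32_t{1} << bitidx);
}

// A caller that is itself annotated may only call annotated helpers,
// which is the rule the GCChecker error above enforces.
static inline void disable_feature(uint32_t &features_en, uint32_t bit) JL_NOTSAFEPOINT
{
    unset_bit(features_en, bit);
}
```

The annotation has to be applied along the whole call chain, which is why the log also points at the definition of the helper and at its call site.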

@vchuravy vchuravy force-pushed the vc/improve_debug_pkgimage branch from 678708a to c7e9ee8 Compare July 20, 2023 12:37
KristofferC added a commit that referenced this pull request Jul 24, 2023
Backported PRs:
- [x] #50411 <!-- Fix weird dispatch of * with zero arguments -->
- [x] #50202 <!-- Remove dynamic dispatch from _wait/wait2 -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50026 <!-- Store heapsnapshot files in tempdir() instead of
current directory -->
- [x] #50402 <!-- Add CPU feature helper function -->
- [x] #50387 <!-- update newpages pointer after actually sweeping pages
-->
- [x] #50424 <!-- avoid potential type-instability in _replace_(str,
...) -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some
cases -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50466 <!-- relax assertion involving pg->nold to reflect that it
may be a bit in… -->
- [x] #50490 <!-- Fix compat annotation for italic printstyled -->
- [x] #50488 <!-- fix typo in `Base.isassigned` with `Tridiagonal` -->
- [x] #50476 <!-- Profile: Add specifying dir for `take_heap_snapshot`
and handling if current dir is unwritable -->
- [x] #50461 <!-- fix typo in the --gcthreads argument description -->
- [x] #50528 <!-- ssair: Correctly handle stmt insertion at end of basic
block -->
- [x] #50533 <!-- ensure internal_obj_base_ptr checks whether objects
past freelist pointer are in freelist -->
- [x] #49322 <!-- improve cat design / performance -->
- [x] #50540 <!-- gc: remove over-eager assertion -->
- [x] #50542 <!-- gf: remove unnecessary assert cycle==depth -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to
vararg -->
- [x] #50058 <!-- Add unwrapping mechanism for triangular mul and solves
-->
- [x] #50551 <!-- typeintersect: also record chained `innervars` -->
- [x] #50552 <!-- read(io, Char): fix read with too many leading ones
-->
- [x] #50541 <!-- precompile: ensure globals are not accidentally
created where disallowed -->
- [x] #50576 <!-- use atomic compare exchange when setting the GC
mark-bit -->
- [x] #50578 <!-- gf: make method overwrite/delete an error during
precompile -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50597 <!-- Fix memory corruption if task is launched inside
finalizer -->
- [x] #50591 <!-- build: fix various makefile bugs -->
- [x] #50599 <!-- faster invalid object lookup in conservative gc -->
- [x] #50634 <!-- 🤖 [master] Bump the SparseArrays stdlib from b4b0e72
to 99c99b4 -->
- [x] #50639 <!-- Backport LLVM patches to fix various issues. -->
- [x] #50546 <!-- Revert storage of method instance in LineInfoNode -->
- [x] #50631 <!-- Shift DCE pass to optimize imaging mode code better
-->
- [x] #50525 <!-- only check that values are finite in `generic_lufact`
when `check=true` -->
- [x] #50587 <!-- isassigned for ranges with BigInt indices -->
- [x] #50144 <!-- Page based heap size heuristics -->


Need manual backport:
- [ ] #50595 <!-- Rename ENV variable `JULIA_USE_NEW_PARSER` ->
`JULIA_USE_FLISP_PARSER` -->



Non-merged PRs with backport label:
- [ ] #50637 <!-- Remove SparseArrays legacy code -->
- [ ] #50618 <!-- inference: continue const-prop' when concrete-eval
returns non-inlineable -->
- [ ] #50598 <!-- only limit types in stack traces in the REPL -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [ ] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [ ] #50172 <!-- print feature flags used for matching pkgimage -->
@vchuravy vchuravy force-pushed the vc/improve_debug_pkgimage branch from c7e9ee8 to bcc50ec Compare August 5, 2023 16:27
@vchuravy vchuravy merged commit 958da95 into master Aug 7, 2023
@vchuravy vchuravy deleted the vc/improve_debug_pkgimage branch August 7, 2023 21:48
KristofferC pushed a commit that referenced this pull request Aug 10, 2023
```
julia> @ccall jl_dump_host_cpu()::Cvoid
CPU: znver2
Features: sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, rdrnd, fsgsbase, bmi, avx2, bmi2, rdseed, adx, clflushopt, clwb, sha, rdpid, sahf, lzcnt, sse4a, prfchw, mwaitx, xsaveopt, xsavec, xsaves, clzero, wbnoinvd

julia> target = only(Base.current_image_targets())
znver2; flags=0; features_en=(sse3, pclmul, ssse3, fma, cx16, sse4.1, sse4.2, movbe, popcnt, aes, xsave, avx, f16c, fsgsbase, bmi, avx2, bmi2, adx, clflushopt, clwb, sha, rdpid, sahf, lzcnt, sse4a, prfchw, mwaitx, xsavec, xsaves, clzero, wbnoinvd)
```

Co-authored-by: Prem Chintalapudi <prem.chintalapudi@gmail.com>
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
(cherry picked from commit 958da95)
KristofferC added a commit that referenced this pull request Aug 16, 2023
Backported PRs:
- [x] #50637 <!-- Remove SparseArrays legacy code -->
- [x] #50665 <!-- print `@time` msg into print buffer -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial
warning -->
- [x] #50670 <!-- Make reinterpret specialize fully. -->
- [x] #50666 <!-- include `--pkgimage=no` caches for stdlibs -->
- [x] #50765 
- [x] #50764
- [x] #50768
- [x] #50767
- [x] #50618 <!-- inference: continue const-prop' when concrete-eval
returns non-inlineable -->
- [x] #50689 <!-- Attach `tanpi` docstring to method -->
- [x] #50671 <!-- Fix rdiv of complex lhs by real factorizations -->
- [x] #50598 <!-- only limit types in stack traces in the REPL -->
- [x] #50766 <!-- Don't partition alwaysinline functions -->
- [x] #50771 <!-- re-allow non-string values in ENV `get!` -->
- [x] #50682 <!-- Add fallback if we have make a weird GC decision. -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50172 <!-- print feature flags used for matching pkgimage -->
- [x] #50844 <!-- Bump OpenBLAS binaries to use the new GEMM
multithreading threshold -->
- [x] #50826 <!-- Update dependency builds -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50655 <!-- fix hashing regression. -->
- [x] #50779 <!-- Minor refactor to image generation -->
- [x] #50791 <!-- Make symbols internal in jl_create_native, and only
externalize them when partitioning -->
- [x] #50724 <!-- Merge opaque closure modules with the rest of the
workqueue -->
- [x] #50738 <!-- Add alignment to constant globals -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception
handling. -->

Need manual backport:

Contains multiple commits, manual intervention needed:

Non-merged PRs with backport label:
- [ ] #50850 <!-- Remove weird Rational dispatch and add pi functions to
list -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50809 <!-- Limit type-printing in MethodError -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50385 <!-- Precompile pidlocks: add to NEWS and docs -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
@KristofferC KristofferC removed the backport 1.10 (Change should be backported to the 1.10 release) label Aug 18, 2023
6 participants